Parameter tuning for fast speech recognition
نویسندگان
چکیده
We describe a novel method for tuning the decoding parameters of a speech-to-text system so as to minimize word error rate (WER) subject to an over-all time constraint. When applied to three sub-realtime systems for recognizing English conversational telephone speech, the method gave speed improvements of up to 21.1% while at the same time reducing WER by up to 6.7%.
منابع مشابه
Approaches to Parameter Tuning for ASR Systems: LVCSR and Expert
In this work we look at the impact and appeal of different approaches to parameter tuning for Automatic Speech Recognition systems. This is a significant issue in the development and comparison of practical as well as research-oriented ASR systems which is nonetheless rarely afforded attention commensurate with its importance. Here we provide a brief discussion of several popular automated appr...
متن کاملImproved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition
Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...
متن کاملBayesian context clustering using cross valid prior distribution for HMM-based speech recognition
Decision tree based context clustering [Young; '94] ・ Construct a parameter tying structure ・ Can estimate robust parameter ・ Can generate unseen context dependent models ・ Minimum description length (MDL) criterion [Shinoda; '97] Bayesian approach ・ Variational Bayesian (VB) method [Attias; '99] ⇒ Applied to speech recognition [Watanabe; '04] ・ Can use prior information ⇒ Affect context cluste...
متن کاملDeveloping a Standardized Medical Speech Recognition Database for Reconstructive Hand Surgery
Fast and holistic access to the patients’ clinical record is a major requirement of modern medical decision support systems (DSS). While electronic health records (EHRs) have replaced the traditional paper-based records in most healthcare organization, the data entry into these systems remains largely manual. Speech recognition technology promises substitution of the more convenient speech-base...
متن کاملA FAST FUZZY-TUNED MULTI-OBJECTIVE OPTIMIZATION FOR SIZING PROBLEMS
The most recent approaches of multi-objective optimization constitute application of meta-heuristic algorithms for which, parameter tuning is still a challenge. The present work hybridizes swarm intelligence with fuzzy operators to extend crisp values of the main control parameters into especial fuzzy sets that are constructed based on a number of prescribed facts. Such parameter-less particle ...
متن کامل